OpenAI launches GPT-5.4-Cyber days after Anthropic’s Mythos reveal

Just days after Anthropic’s Mythos reveal, OpenAI is scaling its identity-verified programme to grant defenders access to GPT-5.4-Cyber, a model fine-tuned for high-level autonomous vulnerability identification.

Wednesday, April 15, 2026, 6 min read

ChatGPT maker OpenAI has announced GPT-5.4-Cyber, a specialised version of its latest AI model that is designed for defensive security work. The development comes a week after rival AI firm Anthropic unveiled its own security model, Claude Mythos Preview.

The introduction of GPT-5.4-Cyber highlights what OpenAI calls the next era of cyber defence. Unlike general-purpose models, which are often trained to refuse any request involving computer code that looks even slightly suspicious, this new variant is described as being cyber-permissive.

This means the AI has lower refusal boundaries for legitimate security tasks. It is designed to understand that a security professional trying to find a flaw is doing something helpful, not harmful.

One of the notable features of this model is its ability to perform binary reverse engineering. This is a technical process in which an expert examines compiled software, the version of a program that a computer actually runs, to understand how it works without access to the original blueprints or source code. By automating parts of this task, the AI can help defenders spot hidden malware or vulnerabilities in software that was previously very difficult to analyse.
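To give a feel for what analysing compiled software without source code involves, here is a minimal Python sketch of one of the simplest first steps an analyst might take: pulling readable text out of raw bytes. It is an illustration only and says nothing about how GPT-5.4-Cyber works internally; the function name `extract_strings` and the sample bytes are hypothetical.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Hypothetical stand-in for the contents of a compiled file.
sample = b"\x7fELF\x02\x01\x00\x00connect_to_c2\x00\x90\x90/bin/sh\x00\x01ab\x00"

for text in extract_strings(sample):
    print(text)
```

Readable fragments such as a shell path or a suspicious function name often give an analyst the first clue about what a binary does. Real reverse engineering goes much further, into disassembly and control-flow analysis.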

Alongside the new model, OpenAI is expanding its Trusted Access for Cyber (TAC) programme. This initiative is moving from a small pilot to a large operation involving thousands of verified individual defenders and hundreds of specialised teams. The goal, according to OpenAI, is to get these powerful tools into the hands of those who protect critical infrastructure and public services.

OpenAI requires identity verification for participants. By verifying the identity of defenders, OpenAI believes it can offer them more permissive models that have fewer safety filters, allowing them to do their jobs without the AI constantly blocking their requests as potentially malicious.

Comparing the titans

The timing of this release is not a coincidence. One week ago, Anthropic announced Project Glasswing, which is built around its Claude Mythos Preview model. Anthropic claimed that its model has reached a level of coding skill where it can surpass almost all humans at finding software flaws. It noted that the model had already autonomously discovered thousands of vulnerabilities in every major operating system and web browser.

While Anthropic is working with a relatively small group of partners like JPMorganChase, NVIDIA, and Google, OpenAI is pushing for what it calls democratised access. OpenAI’s strategy focuses on making advanced defensive capabilities available to a much broader range of legitimate actors, from large corporations to small security research teams.

Both companies are also putting their money where their mouth is, with Anthropic committing up to $100 million in usage credits and OpenAI pledging $10 million in API credits to support defenders.

Brief history

To understand why GPT-5.4-Cyber is such a major step, one must look at how OpenAI’s philosophy has changed over the years. Back in 2019, the company famously chose not to release its GPT-2 model because of fears it could be used for malicious purposes. It viewed AI development as a sudden leap that could be dangerous if handled incorrectly.

Today, the company embraces a principle called iterative deployment: the idea that the best way to make AI safe is to release it in small, controlled stages and learn from how people actually use it in the real world. The process allows society and safety systems to adapt as the models become more capable.

Since 2023, OpenAI has been steadily building toward this moment by launching its Cybersecurity Grant Program and testing agentic coding tools.

One of the precursors to this new model was Codex Security, an automated agent that monitors codebases and suggests fixes. In its initial research preview, Codex Security was used to identify and patch over 3,000 critical and high-severity vulnerabilities.

Preparedness framework

OpenAI uses something called the Preparedness Framework to track and prepare for risks. This framework evaluates models in several categories, including their ability to assist with biological threats or cybersecurity attacks. GPT-5.4 is the first model series that OpenAI has officially treated as having high cybersecurity capability.

Under this framework, a high capability designation means the model might be able to automate the discovery of operationally relevant vulnerabilities or even carry out end-to-end cyber operations. Because the model is this capable, OpenAI activates specific safeguards, including training the model to refuse requests to steal credentials or create malware, while also using automated monitors to watch for suspicious activity.

Focus on defence

A central concept in this entire strategy is defensive acceleration. The traditional view in security is that the attacker has the advantage because they only need to be right once, while the defender must be right every time. AI has the potential to change this power dynamic.

By using models like GPT-5.4-Cyber and Claude Mythos Preview, defenders can find and fix vulnerabilities at a scale and speed that were previously impossible for human teams alone.

The idea is to shift security from a series of occasional audits to a constant, ongoing process where flaws are fixed as soon as code is written. This is often called ‘shifting left’ in the industry, meaning security is handled at the beginning of the software development process rather than at the end.
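As a rough illustration of shifting left, the Python sketch below checks newly written lines of code for risky patterns before they are committed, rather than waiting for a later audit. It is a toy example under stated assumptions: the rule set `RISKY_PATTERNS` and the function `review_new_lines` are hypothetical, and this is not how OpenAI's or Anthropic's tools work. An AI-assisted reviewer would reason about the code far more deeply than two regular expressions can.

```python
import re

# Hypothetical rules for illustration; a real scanner uses much richer analysis.
RISKY_PATTERNS = {
    "hard-coded secret": re.compile(
        r"(password|api_key)\s*=\s*['\"].+['\"]", re.IGNORECASE
    ),
    "shell injection risk": re.compile(r"os\.system\(|shell\s*=\s*True"),
}


def review_new_lines(lines: list[str]) -> list[tuple[int, str]]:
    """Return (line number, issue label) for each risky pattern found."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


# Hypothetical lines a developer has just written.
new_code = [
    'api_key = "abc123"',
    "subprocess.run(cmd, shell=True)",
    "print('hello')",
]

for number, label in review_new_lines(new_code):
    print(f"line {number}: {label}")
```

Running a check like this at the moment code is written is what puts security at the start of the development process instead of the end.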

OpenAI acknowledges that cyber capabilities are inherently dual-use, meaning the same intelligence that helps a defender can also be used by an attacker. It also admits that the company’s safety systems are not perfect and could be bypassed by highly skilled actors using jailbreaks, a technique where a user tricks an AI into ignoring its rules by using clever or adversarial prompts.

Because of this residual risk, OpenAI believes that restricting the most permissive models to verified, trusted users is the only responsible way to move forward.

OpenAI expects AI models to keep growing more powerful. It believes the class of safeguards it uses today is sufficient for current models, but says it is already working on more expansive defences for the next generation.

As OpenAI and Anthropic continue to compete, the ultimate winners may be the everyday users of the internet. If these AI models can be used to patch 27-year-old vulnerabilities that have survived decades of human review, the digital world may become a much harder place for hackers to hide.

Edited by Affirunisa Kankudti
